Remarks

Claims 1-7 are pending in the application. Claims 1-7 are rejected. All 
rejections are traversed. Claim 3 is canceled. 

The specification is objected to. The specification has been amended to 
correct the informalities indicated by the Examiner. 

[Pasted excerpt of the rejection heading from the Office Action, citing
Shibata; illegible in this copy.]



The claims, as amended, claim updating a hierarchical hidden Markov
model for each set of features extracted from a video.



The Examiner states: 

[...] schematic view showing a hierarchical model of video data; and para.
0052, see Fig. 1:

With all due respect, Fig. 1 in Shibata only shows a hierarchical structure of
a video:

"[0052] In the present invention, it is assumed that video data of an object of
processing has such a modeled data structure as shown in FIG. 1 wherein it has
three hierarchical layers of frame, segment and scene. In particular, the video data
is composed of a series of frames in the lowermost hierarchical layer. Further, the
video data is composed of segments, each of which is formed from a series of
successive frames, in a higher hierarchical layer. Furthermore, the video data is
composed of scenes, each of which is formed from segments collected based on a
significant relation, in the highest hierarchical layer."

Shibata does not show a hierarchical hidden Markov model (HHMM) as
claimed. Frames, segments and scenes are not hierarchical hidden Markov 
models. 

According to the MPEP, "During patent examination, the pending claims
must be 'given their broadest reasonable interpretation consistent with the
specification.' The Federal Circuit's en banc decision in Phillips v. AWH
Corp., 415 F.3d 1303, 75 USPQ2d 1321 (Fed. Cir. 2005) expressly
recognized that the USPTO employs the 'broadest reasonable interpretation'
standard... Indeed, the rules of the PTO require that application claims must
'conform to the invention as set forth in the remainder of the specification
and the terms and phrases used in the claims must find clear support or
antecedent basis in the description so that the meaning of the terms in the
claims may be ascertainable by reference to the description.' 37 CFR
1.75(d)(1)... During examination, the claims must be interpreted as broadly
as their terms reasonably allow. In re American Academy of Science Tech
Center, 367 F.3d 1359, 1369, 70 USPQ2d 1827, 1834 (Fed. Cir. 2004) (The
USPTO uses a different standard for construing claims than that used by
district courts; during examination the USPTO must give claims their
broadest reasonable interpretation in light of the specification.)."

In the above, emphasis is used to indicate that the Examiner must interpret 
the claims in light of the specification. The MPEP does not say that the 
Examiner is permitted to interpret the claims according to the cited art.

Formally defined, a hierarchical hidden Markov model (HHMM) is a
statistical model derived from the hidden Markov model (HMM). In a
HHMM, each state is considered to be a self-contained probabilistic model.
More precisely, each state of the HHMM is itself a HHMM, which makes it
hierarchical, see S. Fine, Y. Singer and N. Tishby, "The Hierarchical Hidden
Markov Model: Analysis and Applications," Machine Learning, vol. 32, pp.
41-62, 1998, and below:

"We now give a formal description of an HHMM. Let $\Sigma$ be a finite
alphabet. We denote by $\Sigma^*$ the set of all possible strings over
$\Sigma$. An observation sequence is a finite string from $\Sigma^*$
denoted by $\bar{o} = o_1 o_2 \cdots o_T$. A state of an HHMM is denoted
by $q_i^d$ ($d \in \{1, \ldots, D\}$), where $i$ is the state index and $d$
is the hierarchy index. The hierarchy index of the root is 1 and of the
production states is $D$. The internal states need not have the same number
of substates. We therefore denote the number of substates of an internal
state $q_i^d$ by $|q_i^d|$. Whenever it is clear from the context, we omit
the state index and denote a state at level $d$ by $q^d$. In addition to its
model structure (topology), an HHMM is characterized by the state
transition probability between the internal states and the output
distribution vector of the production states. That is, for each internal
state $q_i^d$ ($d \in \{1, \ldots, D-1\}$), there is a state transition
probability matrix denoted by $A^{q^d} = (a_{ij}^{q^d})$, where
$a_{ij}^{q^d} = P(q_j^{d+1} \mid q_i^{d+1})$ is the probability of making a
horizontal transition from the $i$th state to the $j$th, both of which are
substates of $q^d$. Similarly,
$\Pi^{q^d} = \{\pi^{q^d}(q_i^{d+1})\} = \{P(q_i^{d+1} \mid q^d)\}$ is the
initial distribution vector over the substates of $q^d$, which is the
probability that state $q^d$ will initially activate the state $q_i^{d+1}$.
If $q_i^{d+1}$ is in turn an internal state, then $\pi^{q^d}(q_i^{d+1})$ may
also be interpreted as the probability of making a vertical transition:
entering substate $q_i^{d+1}$ from its parent state $q^d$. Each production
state $q^D$ is solely parameterized by its output probability vector
$B^{q^D} = \{b^{q^D}(k)\}$, where $b^{q^D}(k) = P(\sigma_k \mid q^D)$ is the
probability that the production state $q^D$ will output the symbol
$\sigma_k \in \Sigma$. The entire set of parameters is denoted by

$$\lambda = \{\lambda^{q^d}\}_{d \in \{1, \ldots, D\}} = \{\{A^{q^d}\}_{d \in \{1, \ldots, D-1\}}, \{\Pi^{q^d}\}_{d \in \{1, \ldots, D-1\}}, \{B^{q^D}\}\}.$$"
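For illustration only, the recursive structure described in the quoted definition can be sketched in a few lines of Python. This is a minimal sketch and not the claimed method: the class names `ProductionState` and `InternalState`, the helper functions, and the toy probabilities are all hypothetical. As a simplifying assumption, any probability mass missing from a row of the horizontal transition matrix is read as the probability of returning control to the parent state.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class ProductionState:
    """Leaf state: parameterized only by its output distribution B."""
    output: Dict[str, float]  # P(symbol | this production state)


@dataclass
class InternalState:
    """Internal state: its substates form a lower-level model of their own."""
    substates: List["State"]
    initial: List[float]           # Pi: vertical transitions into each substate
    transition: List[List[float]]  # A: horizontal transitions among substates


State = Union[ProductionState, InternalState]


def depth(state: State) -> int:
    """Number of hierarchy levels at or below this state."""
    if isinstance(state, ProductionState):
        return 1
    return 1 + max(depth(s) for s in state.substates)


def is_valid(state: State, tol: float = 1e-9) -> bool:
    """Check that every distribution in the hierarchy is well formed."""
    if isinstance(state, ProductionState):
        return abs(sum(state.output.values()) - 1.0) <= tol
    n = len(state.substates)
    if len(state.initial) != n or abs(sum(state.initial) - 1.0) > tol:
        return False
    if len(state.transition) != n:
        return False
    for row in state.transition:
        # Assumption: a row may sum to less than 1; the remainder is the
        # probability of exiting to the parent state.
        if len(row) != n or min(row) < 0 or sum(row) > 1.0 + tol:
            return False
    return all(is_valid(s, tol) for s in state.substates)


# Toy three-level hierarchy: root -> one internal state -> two production states.
leaf_a = ProductionState({"x": 0.9, "y": 0.1})
leaf_b = ProductionState({"x": 0.2, "y": 0.8})
mid = InternalState([leaf_a, leaf_b], [0.5, 0.5], [[0.6, 0.3], [0.2, 0.7]])
root = InternalState([mid], [1.0], [[1.0]])
```

The point of the sketch is that every internal state carries its own initial vector and transition matrix over its substates, which is what makes the model hierarchical. A plain nesting of frames, segments and scenes carries no such probabilistic parameters.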

Therefore, Shibata cannot evaluate an information gain of each feature on a 
Viterbi state sequence with respect to a reference state sequence of the 
hierarchical statistical model. 

Applicants find the Examiner's argument that Shibata shows the claimed
HHMM limitations unpersuasive. To assist in understanding the difference,
the cited art and the invention are distinguished below.



[Figure: Shibata's hierarchy]

[Figure: Invention's HHMM]

The Examiner states:

evaluating an information gain of the hierarchical statistical model [para. 0104,
[...] n-dimensional vector ... histogram and a power spectrum are involved;
para. 0105 - 0106];



Applicants cannot find any reference to "a statistic... n-dimensional...
histogram... power spectrum" in the referenced paragraphs:

"[0105] In the first case, the sample number is determined to be k in advance, and 
the video-audio processing apparatus uses a well-known k-means-clustering
method disclosed in L. Kaufman and P. J. Rousseeuw, "Finding Groups in Data:
An Introduction to Cluster Analysis", John-Wiley and sons, 1990 to automatically 
divide the feature amounts regarding the entire segment into groups each 
including k feature amounts. Then, the video-audio processing apparatus selects, 
from each group of k samples, a sample whose sample value is equal or proximate 
to a centroid of the group. The complexity of the processing by the video-audio 
processing apparatus increases merely linearly in proportion to the sample 
number. 

[0106] Meanwhile, in the second case, the video-audio processing apparatus uses 
a k-medoids algorithm method disclosed in L. Kaufman and P. J. Rousseeuw,
"Finding Groups in Data: An Introduction to Cluster Analysis", John-Wiley and
sons, 1990 to form groups of k samples. Then, the video-audio processing
apparatus uses, as a sample value for each of the groups of k samples, a medoid of
the group described above."

Furthermore, there is nothing in the claim about "a statistic...
n-dimensional... histogram... power spectrum." The Examiner's reasoning is a
non sequitur.

Next, the invention claims filtering redundant features. The Examiner cites
paragraph [0101]:

"[0101] The video-audio processing apparatus extracts more than one static
feature amount from different points of time within one segment, for example, as
seen from FIG. 5. In this instance, the video-audio processing apparatus
determines the extraction number of feature amounts by balancing maximization
of the fidelity and minimization of the data redundancy in the segment
representation. For example, where a certain one image in a segment can be
designated as a key frame of the segment, a histogram calculated from the key
frame is used as sample feature amounts to be extracted."

The meaning of filtering redundant features is plain and simple. But certainly
neither "balancing maximization of the fidelity" nor "minimization of the
data redundancy in the segment representation" has anything to do with
filtering redundant features in a HHMM. With all due respect, the
Examiner's analogy makes no sense.

The invention claims updating the HHMM model based on the filtered 
features. The Examiner again cites paragraphs [0104-0107]. Forming a 
dissimilarity measurement for a feature does not update a HHMM as 
claimed. Furthermore, Shibata has no statistical model to update. 

Shibata does not apply a Bayesian transformation to each HHMM. 

There is no support in the Office Action for the statement:

[Quoted Office Action statement illegible in this copy.]

Applicants cannot find anything "supra" in the Office Action that details
"rank ordering the (HHMM) model and feature set pairs to learn the
structure and detect the events in the video in an unsupervised manner," nor
any rank ordering of a statistical model or event detection, in the Office
Action or in Shibata.

The Examiner's comment, supra, is merely conclusionary. As is recognized
in MPEP 707.07(d), "omnibus rejection of the claim ... is usually not
informative and should therefore be avoided." MPEP 707.07(f) further
mandates that "where a major technical rejection is proper, it should be
stated with a full development of the reasons rather than by a mere
conclusion coupled with some stereotyped expression." The rejection by the
Examiner is a mere conclusion, without a full development of reasons.
MPEP 706.07 further makes clear that "the invention as disclosed and
claimed should be thoroughly searched in the first action and the references
should be fully applied." In the present application, the rejection fails not
only to provide a reasonable rationale as to how, in the Examiner's view, the
applied art can be construed to teach each and every feature in the rejected
claims, but the rejection also fails to even consider explicitly claimed
features of the invention as recited in claim 1.

From the above, it is clear that Shibata does not describe any of the claimed
limitations. Therefore, Shibata cannot make the invention obvious. Choi does
not describe hierarchical hidden Markov models. Therefore, Shibata in
combination with Choi also cannot make the invention obvious.

Choi does not teach applying Bayesian information criteria to each HHMM 
and feature set pair. Choi describes using Bayesian techniques for self- 
organizing feature maps in the form of a neural network. There is no 
evidence that the Choi Bayesian techniques can be applied to HHMMs as 
claimed.

Choi deals with organizing documents. Choi is irrelevant for the problem
faced by Shibata, namely determining a structure of a video. A person of
ordinary skill in the art would never consider Choi as pertinent art to solve
this problem. A combination of Shibata and Choi is technically impossible.
Choi can never cure the numerous fatal defects of Shibata.

With respect to claim 2, the Gaussian distributions in Choi apply to network
parameters: "[0029] In a preferred embodiment of the present invention, the
prior information determined in the sixth step takes the form of probability
distribution, and the network parameter has a Gaussian distribution." Again,
neither Shibata nor Choi has HHMMs that can be expressed as Gaussian
distributions. A combination of Shibata and Choi is not feasible.

[Pasted Office Action excerpt: heading of the rejection addressed next;
illegible in this copy.]

With respect to claims 3 and 4, neither Ozer nor Sterz updates HHMMs from
features extracted from a video. Lin deals with detecting events, not learning
structure. Furthermore, Lin is inappropriate because in Lin the events are
detected from objects that have been segmented from the video. The
example that Lin uses is a falling body, and the object parts are the person's
head, torso or legs. Obviously, Lin is not a feature-based system. Sterz deals
with speech recognition, which is totally irrelevant to the problem of
Shibata. Furthermore, speech and video signals are completely different and
techniques such as those of Sterz cannot be applied to signals processed by
Shibata.

With respect to claim 5, the Examiner rejects that claim on Official Notice,
but provides no support for this assertion. MPEP 2144.03 states:

"It would not be appropriate for the examiner to take official notice of 
facts without citing a prior art reference where the facts asserted to be 
well known are not capable of instant and unquestionable 
demonstration as being well-known. For example, assertions of 
technical facts in the areas of esoteric technology or specific 
knowledge of the prior art must always be supported by citation to 
some reference work recognized as standard in the pertinent art. ... In
re Eynde, 480 F.2d 1364, 1370, 178 USPQ 470, 474 (CCPA 1973)
("[W]e reject the notion that judicial or administrative notice may be
taken of the state of the art. The facts constituting the state of the art
are normally subject to the possibility of rational disagreement among
reasonable men and are not amenable to the taking of such notice.")."

Generating a HHMM from dominant color ratios, motion intensities, least-
square estimates of camera translation, audio volumes, spectral roll-offs,
low-band energies, high-band energies, and zero-crossing rates (ZCR) is not
capable of such a demonstration. Generating HHMMs is, in general, an
extremely esoteric technology and a difficult area to understand; see, for
example, the Examiner's interpretation of HHMMs in the Office Action.
Assertions that certain procedures in the field are well known must be
supported by evidence as required by the MPEP.

Furthermore, MPEP 2144.03 goes on to state:

"If such notice is taken, the basis for such reasoning must be set forth
explicitly. The examiner must provide specific factual findings
predicated on sound technical and scientific reasoning to support his
or her conclusion of common knowledge. See Soli, 317 F.2d at 946,
137 USPQ at 801; Chevenard, 139 F.2d at 713, 60 USPQ at 241. The
applicant should be presented with the explicit basis on which the 
examiner regards the matter as subject to official notice and be 
allowed to challenge the assertion in the next reply after the Office 
action in which the common knowledge statement was made." 

The Examiner sets forth no explicit reasoning as to why he believes
generating a HHMM from dominant color ratios, motion intensity, a least-
square estimate of camera translation, audio volume, spectral roll-off, low-
band energy, high-band energy, and zero-crossing rate (ZCR) is well known,
specifically in light of what the Examiner believes to be an HHMM. As such,
the Examiner is depriving the Applicants of the ability to challenge his
assertion, which is clearly in violation of the directives of the MPEP. The
Examiner is respectfully requested either to provide evidentiary support and
clearly state his rationale in support of his assertions or to withdraw his
rejections.

[...] hereinafter Shibata, in view of Choi (US 2002/0042793 A1), and further in
view of [...] Bremer.

Claimed are features filtered with a Markov blanket. Bremer deals with
mood disorders and manic depression in suicidal patients. It is highly
unlikely that Shibata or Choi would look to Bremer's mentally ill patients to
solve the problem of determining a structure of a video. A combination of
Bremer and Shibata would indeed be bizarre.

[...] hereinafter Shibata, in view of Choi (US 2002/0042793 A1), and further in
view of Altschuler et al. (US 6,012,052), hereinafter Altschuler.

Claimed is evaluating an information gain in a HHMM using expectation 
maximization and a Markov chain Monte Carlo method. 

Again, the Examiner reaches for incompatible art, which has no relationship
to the primary references. First, note that neither Shibata nor Choi has
HHMMs, and neither does Altschuler. In Altschuler, a Markov chain is used to
sample a joint distribution of resource transition probabilities in a network,
not to evaluate an HHMM of a video. The EM algorithm is used to estimate
free parameters.

The Examiner states:

therefore it would have been obvious [...] at the time the invention was made
to apply "feature amounts, hierarchical model of video data, statistic
representative value of an entire segment and minimizations of the data
redundancy" disclosed by Shibata in combination with "Bayesian self
organizing feature maps (SOM)" and "Bayesian statistical technique"
disclosed by Choi, and motivated to combine [...] is directed toward audio
visual signal processing of video data, and although Choi is directed toward
performing real-time document clustering for relevant documents in
accordance with a degree of semantic similarity, Choi would [...] retrieval via
performing real-time [...] in para. [...], and further coupled with "Markov
Chain Monte Carlo algorithms" and "Expectation Maximization (or [...]
algorithm" disclosed by Altschuler and motivated [...] methods to pre-fetch
resources and build resource link topology templates may also be used for
collaborative filtering as disclosed by Altschuler in col. 3, lns. 60-67.



As best as can be understood, the Examiner states that it would be obvious to
combine Shibata with Choi and Altschuler, because the Choi Bayesian
techniques for rank ordering Web pages and Altschuler's EM algorithm for
making consumer preferences using collaborative filtering would improve
an understanding of the structure of the Shibata video. This is indeed
interesting but unpersuasive and, with all due respect, makes no sense.
Clarification and amplification are respectfully requested.



It is believed that this application is now in condition for allowance. A 
notice to this effect is respectfully requested. Should further questions arise 
concerning this application, the Examiner is invited to call Applicants'
attorney at the number listed below. Please charge any shortage in fees due
in connection with the filing of this paper to Deposit Account 50-0749.

Respectfully submitted, 

Mitsubishi Electric Research Laboratories, Inc.

By 

/Dirk Brinkman/ 



Dirk Brinkman 
Attorney for the Assignee 
Reg. No. 35,460 

201 Broadway, 8th Floor
Cambridge, MA 02139 
Telephone: (617) 621-7517 
Customer No. 022199 